Stochastic Control Problems in Mobile Communications

Final Report, Army Research Office, Dec. 2002
Paul Dupuis and Harold J. Kushner

Applied Mathematics Dept.

Brown University

• Convergence of Proportional-Fair Sharing Algorithms Under General
Conditions [1]. We were concerned with the allocation of the base
station transmitter time in time-varying mobile communications with
many users who are transmitting data. Time is divided into small
scheduling intervals, and the channel rates for the various users are
available at the start of the intervals. Since the rates vary randomly,
there is a conflict between full use (by selecting the user with the
highest current rate) and fairness. The Proportional Fair Scheduler
(PFS) of the Qualcomm High Data Rate (HDR) system and related
algorithms are designed to deal with such conflicts. The aim was to put
such algorithms on a sure mathematical footing and analyze their
behavior. Such algorithms are of the stochastic approximation type, and
results of stochastic approximation are used to analyze the long-term
properties. It was shown that the limiting behavior of the sample paths
of the throughputs is well approximated by the solution of an
intuitively reasonable ordinary differential equation (ODE), which is
akin to a mean flow. It was shown that the ODE has a unique equilibrium
and that it is characterized as optimizing a concave utility function,
which shows that PFS is not ad hoc, but actually corresponds to a
reasonable maximization problem. These results may be used to analyze
the performance of PFS. The results depend on the fact that the mean ODE
has a special form that arises in problems with certain types of
competitive behavior. There is a large set of such algorithms, each one
corresponding to a concave utility function. This set allows a choice of
tradeoffs between the current rate and throughput. Extensions to
multiple antenna and frequency systems are given. Finally, the infinite
backlog assumption is dropped and the data is allowed to arrive at
random. This complicates the analysis, but the same results hold.
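
The stochastic approximation structure of PFS described above can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the algorithm as analyzed in [1]: the function name `simulate_pfs`, the i.i.d. exponential channel rates, the step size `eps`, and the small positive constant `d` are all hypothetical choices made for illustration. In each scheduling interval the user with the largest ratio of current rate to estimated throughput is served, and the throughput estimates are updated by a constant-step averaging recursion.

```python
import random


def simulate_pfs(mean_rates, steps=20000, eps=0.01, d=1e-3, seed=0):
    """Simulate a proportional-fair scheduler (illustrative sketch).

    mean_rates: hypothetical mean channel rates, one per user. Rates are
    drawn i.i.d. exponential in each slot (an assumption, for illustration).
    Returns the final throughput estimates and the fraction of slots
    allocated to each user.
    """
    rng = random.Random(seed)
    k = len(mean_rates)
    theta = [d] * k          # throughput estimates, one per user
    slots = [0] * k          # number of slots given to each user
    for _ in range(steps):
        # Channel rates are known at the start of the scheduling interval.
        rates = [rng.expovariate(1.0 / m) for m in mean_rates]
        # Serve the user with the largest rate relative to its throughput.
        chosen = max(range(k), key=lambda i: rates[i] / (d + theta[i]))
        slots[chosen] += 1
        # Stochastic approximation update of the throughput estimates.
        for i in range(k):
            served = rates[i] if i == chosen else 0.0
            theta[i] += eps * (served - theta[i])
    return theta, [s / steps for s in slots]
```

Calling `simulate_pfs([1.0, 4.0])` models two users whose channels differ only by a scale factor. In this symmetric setting one expects each user to receive roughly half of the slots, while the user with the better channel obtains the larger throughput, which is the kind of tradeoff between current rate and fairness that the concave utility characterization makes precise.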

• Wireless systems with time varying channels [2]. Consider the forward
link of a system with K remote units and a single base transmitter with
time-varying connecting channels. Data to be transmitted to the remote
units arrives according to some random process and is queued according
to its destination. Power is to be allocated to the K channels in a
queue- and channel-state-dependent way to minimize some cost criterion.
The channel fading rate is fast and the bandwidth and data arrival rates
are high. Owing to the high speed of the fading, arrival, and service
processes, an asymptotic or averaging method of the heavy traffic type
is promising. By heavy traffic, we mean that on the average there is
little server idle time and little spare power over the “average”
requirements. Heavy traffic analysis has been very helpful in
simplifying control problems in queueing and communications networks. It
eliminates inessential detail and focuses on the fundamental issues of
scaling and parametric dependencies. To illustrate the scope of the
method, several models are considered. The basic model assumes that the
channel state is known, and that given the channel state there is a
well-defined rate of transmission per unit power. Then convergence of
the controlled scaled queue lengths is shown. The scaling differs from
the usual one in heavy traffic work. The appropriate orders of reserve
power and buffer size are given, as well as suggested policies. The
approximating process is a controlled reflected diffusion, which is
simpler than the original problem and facilitates understanding
parametric dependencies, solutions, and stability. The averaging is
robust and can be done under a great variety of conditions. To further
illustrate the scope of the method, more complicated systems are also
treated; e.g., power scheduling might be done at prescribed intervals
only, knowledge of the channel might be subject to random errors,
retransmissions might be needed due to excessive errors at the receiver,
or the system might be of the TDMA type. The methods and approximating
control problems are similar.

• Stability and Control of Mobile Communications Systems With Time
Varying Channels [3]. Consider the forward link of a mobile
communications system with a single transmitter and rather arbitrary
randomly time-varying channels connecting the base to the mobiles. Data
arrives at the base in some random way and is queued according to the
destination until transmitted. The main issues are the allocation of
transmitter power and time to the various queues in a queue- and
channel-state-dependent way to assure stability and good operation. The
control decisions are made at the beginning of the (small) scheduling
intervals. Stability methods are used to allocate time and power. Many
schemes of current interest can be handled: for example, CDMA with
control over the bit interval and power per bit, TDMA with control over
the time allocated, power per bit, and bit interval, as well as
arbitrary combinations. The details of the scheme are not directly
involved; all essential factors are incorporated into a “rate” and
“error” function. There might be random errors in transmission which
require retransmission. The channel-state process might be known or only
partially known. The system and channel process are scaled by speed. A
fluid approximation is derived for a canonical model, via a weak
convergence analysis. Under a stability assumption on the fluid model
and some other natural conditions, it is shown that the physical system
can be controlled to be stable, uniformly in the speed. Owing to the
non-Markov nature of the problem, we use the perturbed Liapunov function
method, which is very useful for the analysis of non-Markovian systems.
Finally, the perturbed Liapunov method is used to actually choose the
power and time allocations. The allocation will depend on the Liapunov
function, but each such function corresponds loosely to an optimization
problem for some performance criterion. Since there is a choice of
Liapunov functions, various performance criteria can be taken into
account in the allocations. The power of the method is due to the rather
general conditions under which it works.
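
To make the idea of a Liapunov-function-based allocation concrete, the following sketch uses the plain quadratic function V(x) = sum of x_i^2 in a toy time-slotted model. This is an illustrative stand-in and not the perturbed Liapunov function method of [3]: the function name `simulate_quadratic_liapunov`, the Bernoulli arrivals, and the i.i.d. rate levels are all assumptions. Serving queue i at rate r_i changes V by about -2 x_i r_i to first order, so the rule below serves the queue with the largest product of queue length and current rate.

```python
import random


def simulate_quadratic_liapunov(arrival_probs, rate_levels, steps=50000, seed=1):
    """Toy queue- and channel-state-dependent time allocation (sketch).

    arrival_probs: hypothetical per-slot Bernoulli arrival probabilities.
    rate_levels: hypothetical channel rates; each queue's rate is drawn
    uniformly from this list in every slot.
    Returns the time-averaged total queue length.
    """
    rng = random.Random(seed)
    k = len(arrival_probs)
    queues = [0.0] * k
    running_total = 0.0
    for _ in range(steps):
        # Random arrivals, queued according to destination.
        for i in range(k):
            if rng.random() < arrival_probs[i]:
                queues[i] += 1.0
        # Channel rates observed at the start of the scheduling interval.
        rates = [rng.choice(rate_levels) for _ in range(k)]
        # First-order decrease of V(x) = sum x_i^2 is 2 * x_i * r_i, so
        # give the whole slot to the queue that maximizes x_i * r_i.
        chosen = max(range(k), key=lambda i: queues[i] * rates[i])
        queues[chosen] = max(0.0, queues[chosen] - rates[chosen])
        running_total += sum(queues)
    return running_total / steps
```

For a lightly loaded example such as `simulate_quadratic_liapunov([0.3, 0.3], [0.0, 1.0, 2.0])` the average total queue length stays small, which is the behavior one would expect from a stabilizing rule. A different choice of V would weight the queues differently, in line with the remark above that each Liapunov function corresponds loosely to a performance criterion.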

• Large Deviation Approximations for Occupancy Problems. The paper [4]
was completed. In the occupancy problem one considers the distribution
of r balls in n cells, with each ball assigned independently to a given
cell with probability 1/n. In the paper just mentioned a large deviation
approximation as r and n tend to infinity was proved. Occupancy problems
have many applications in computer science, mathematical biology, and
elsewhere, but the original motivation for this work is the design of an
optical communication switch, where blocking probabilities are of
central importance. Here the urns correspond to channels and the balls
to packets being routed through the switch. In order to analyze the
problem a dynamical model is first considered, where the balls are
placed in the cells sequentially and “time” corresponds to the number of
balls that have already been thrown. A complete large deviation analysis
of this “process level” problem is carried out, and the rate function
for the original problem is then obtained via the contraction principle.
The variational problem that characterizes this rate function is
analyzed, and (in sharp contrast to most analyses of large deviations
for Markov processes) a complete and explicit solution is obtained. The
minimizing trajectories and minimal cost are identified up to two
constants, and the constants are characterized as the unique solution to
an elementary fixed point problem. These results are then used to solve
a number of interesting problems, including the overflow problem and the
so-called partial coupon collector’s problem.

Extensions of this work will consider urn problems with balls of
differing colors. One can define a mapping that turns the occupancy
process for colored balls into a process for random motion on a lattice,
which we refer to as a migration process. With some further extension of
the underlying extremals, the corresponding solutions to the variational
problems for the migration process can be identified. Migration
processes are relevant to wireless networks, in particular to ad hoc
networks where the statistical properties of the mobile node positions
with respect to one another are related to the communication
capabilities of the network.
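
The dynamical view of the occupancy problem described above, in which balls are thrown one at a time and “time” is the number of balls thrown so far, can be sketched for small n and r by exact computation. The code below is only an illustration and does not reproduce the large deviation analysis of [4]; the function name `occupied_distribution` is hypothetical. It uses the elementary fact that the number of occupied cells is a Markov chain: with j cells occupied, the next ball lands in an empty cell with probability (n - j)/n.

```python
def occupied_distribution(n, r):
    """Exact law of the number of occupied cells after r balls in n cells.

    Returns a list dist of length n + 1, where dist[j] is the probability
    that exactly j cells are occupied. Each ball is placed independently
    in a given cell with probability 1/n.
    """
    dist = [0.0] * (n + 1)
    dist[0] = 1.0                      # no balls thrown, no cell occupied
    for _ in range(r):                 # "time" = number of balls thrown
        new = [0.0] * (n + 1)
        for j, p in enumerate(dist):
            if p == 0.0:
                continue
            new[j] += p * j / n        # ball lands in an occupied cell
            if j < n:
                new[j + 1] += p * (n - j) / n   # ball lands in an empty cell
        dist = new
    return dist
```

Tail probabilities of interest, such as the probability that many cells remain empty, can be read off by summing entries of the returned list. For large r and n such direct computation becomes expensive and the probabilities become extremely small, which is where a large deviation approximation is useful.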


• Regulation and Analysis of Stochastic Networks. Research in this area
proceeded along several directions. Key issues in our investigations
included: (i) the development of approximate models that allow for
explicit (or nearly explicit) construction of the optimal
routing/service policies, and (ii) robustness and the ability to deal
with model perturbations. The problems considered encompass a number of
different formulations, each of which emphasizes different qualitative
properties of the resulting controlled network. These include
risk-sensitive control and the control of rare events, optimal control
of “fluid” models, and optimally robust control of such models. “Fluid”
models are approximate models that are obtained under a law of large
numbers scaling. In all cases the system model is constrained, and in
most problems this is a dominant feature. To handle the constraints, we
model the state dynamics of the approximate (or limit) models in terms
of an appropriate Skorokhod Problem and corresponding constrained
ordinary or stochastic differential equations. The different problem
setups (e.g., control of rare events versus control of fluid models)
lead to problems of the same basic form, namely a Skorokhod Problem with
(relatively) simple dynamics and simple cost structures.

The paper [5] considers the problem of characterizing the manner in
which rare events occur in networks under heavy traffic. It shows how
time reversal arguments can be applied to rewrite the control problem
defined by a straightforward large deviations analysis into the form of
[5], and then shows how explicit finite-dimensional solutions can be
obtained for some interesting classes of problems (in arbitrary
dimension). A three-dimensional example is worked out in detail for
illustrative purposes.

The paper [6] considers the robust optimal control of a law of large
numbers approximation of a stochastic network. The robust control
problem is formulated as a dynamic differential game, with one player
choosing the policies that determine service and routing assignments,
and the other choosing quantities such as the arrival and service rates,
subject to constraints. The cost to be minimized by the first player and
maximized by the second is the time until the origin is reached. The
robust formulation allows one to differentiate between the many policies
(at the fluid level) that are optimal for an ordinary cost. An explicit
formula is given for the value function, and some of its basic
properties are studied. The problem of policy synthesis for particular
classes of problems has also been studied, and for these cases we have
explicitly constructed a robustly optimal control.
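
The role of the Skorokhod Problem in handling constraints, mentioned above, can be illustrated in the simplest possible case: a single queue constrained to the half line [0, ∞). The networks treated in this work are multidimensional, so the sketch below is only a one-dimensional, discrete-time illustration; the function name `skorokhod_map_1d` is hypothetical. Given an unconstrained path psi, the constrained path is phi = psi + eta, where eta is the smallest nondecreasing “pushing” term that keeps phi nonnegative, namely the running maximum of the negative part of psi.

```python
def skorokhod_map_1d(psi):
    """One-dimensional Skorokhod map on [0, inf) for a sampled path.

    psi: list of values of the unconstrained path.
    Returns (phi, eta), where phi[k] = psi[k] + eta[k] >= 0 and eta is the
    minimal nondecreasing pushing term, eta[k] = max(0, max_{j<=k} -psi[j]).
    """
    phi, eta = [], []
    pushed = 0.0
    for value in psi:
        pushed = max(pushed, -value)   # running maximum of the negative part
        eta.append(pushed)
        phi.append(value + pushed)     # constrained path stays nonnegative
    return phi, eta
```

For the sample path `[0, -1, -0.5, 1, -2]` the pushing term is `[0, 1, 1, 1, 2]` and the constrained path is `[0, 0, 0.5, 2, 0]`: the pushing term increases only at times when the constrained path sits at the boundary.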


• Importance sampling and simulation. The design of efficient
simulation schemes is an important consideration in the design and
analysis of stochastic systems and stochastic networks. The paper [7]
considers fundamental issues related to the technique known as
“importance sampling.” A standard heuristic for importance sampling is
that the changes of measure used to prove large deviation lower bounds
give good performance when used for importance sampling. Recent work,
however, has suggested that the heuristic is incorrect in many
situations. The perspective put forth in [7] is that large deviation
theory suggests many changes of measure, and that not all are suitable
for importance sampling. In the setting of Cramér’s Theorem, the
traditional interpretation of the heuristic suggests a fixed change of
distribution on the underlying independent and identically distributed
summands. In contrast, [7] considers importance sampling schemes where
the exponential change of measure is adaptive, in the sense that it
depends on the historical empirical mean. The existence of
asymptotically optimal schemes within this class is demonstrated. The
result indicates that an adaptive change of measure, rather than a
static change of measure, is what the large deviations analysis truly
suggests. The proofs utilize a control-theoretic approach to large
deviations, which naturally leads to the construction of asymptotically
optimal adaptive schemes in terms of a limit Bellman equation. Numerical
examples contrasting the adaptive and standard schemes are presented, as
well as an interpretation of their different performances in terms of
differential games.
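
For orientation, the static scheme referred to above as the traditional interpretation of the heuristic can be sketched in the setting of Cramér’s Theorem. The code below is a minimal illustration under stated assumptions and does not implement the adaptive scheme of [7]: the summands are taken to be standard Gaussian, the target event is that the sample mean is at least a, and the function names `tilted_estimate` and `exact_probability` are hypothetical. Under the exponential tilt with parameter theta the summands become N(theta, 1), and each sample is weighted by the likelihood ratio exp(-theta S_n + n theta^2 / 2).

```python
import math
import random


def tilted_estimate(n, a, theta, samples=20000, seed=2):
    """Static importance sampling estimate of P(S_n / n >= a).

    The summands are i.i.d. N(0, 1) under the original measure (an
    assumption made for illustration) and are sampled from N(theta, 1)
    under the fixed exponential change of measure.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        s = sum(rng.gauss(theta, 1.0) for _ in range(n))
        if s >= n * a:
            # Likelihood ratio of the original measure to the tilted one.
            total += math.exp(-theta * s + 0.5 * n * theta * theta)
    return total / samples


def exact_probability(n, a):
    """Exact value of P(S_n / n >= a) when S_n is N(0, n)."""
    return 0.5 * math.erfc(a * math.sqrt(n) / math.sqrt(2.0))
```

With n = 20 and a = 1 the exact probability is of order 10^-6, so plain Monte Carlo with a few thousand samples would typically see no occurrences of the event, whereas `tilted_estimate(20, 1.0, 1.0)` with theta = a can be compared directly with `exact_probability(20, 1.0)`. For a half-line event like this one the fixed tilt behaves well; the point made above is that for more general events a fixed change of measure can perform poorly, which motivates schemes that depend on the historical empirical mean.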